AAAI.2022 - Demonstration Track

Total: 30

#1 Building Goal-Oriented Dialogue Systems with Situated Visual Context

Authors: Sanchit Agarwal ; Jan Jezabek ; Arijit Biswas ; Emre Barut ; Bill Gao ; Tagyoung Chung

Goal-oriented dialogue agents can comfortably utilize the conversational context and understand their users' goals. However, in visually driven user experiences, these conversational agents are also required to make sense of the screen context in order to provide a proper interactive experience. In this paper, we propose a novel multimodal conversational framework where the dialogue agent's next action and its arguments are derived jointly, conditioned on both the conversational and the visual context. We demonstrate the proposed approach via a prototypical furniture shopping experience for a multimodal virtual assistant.

#2 PYLON: A PyTorch Framework for Learning with Constraints

Authors: Kareem Ahmed ; Tao Li ; Thy Ton ; Quan Guo ; Kai-Wei Chang ; Parisa Kordjamshidi ; Vivek Srikumar ; Guy Van den Broeck ; Sameer Singh

Deep learning excels at learning task information from large amounts of data, but struggles with learning from declarative high-level knowledge that can be more succinctly expressed directly. In this work, we introduce PYLON, a neuro-symbolic training framework that builds on PyTorch to augment procedurally trained models with declaratively specified knowledge. PYLON lets users programmatically specify constraints as Python functions and compiles them into a differentiable loss, thus training predictive models that fit the data whilst satisfying the specified constraints. PYLON includes both exact and approximate compilers to efficiently compute the loss, employing fuzzy logic, sampling methods, and circuits, ensuring scalability even to complex models and constraints. Crucially, a guiding principle in designing PYLON is the ease with which any existing deep learning codebase can be extended to learn from constraints in a few lines of code: a function that expresses the constraint, and a single line to compile it into a loss. Our demo comprises models in NLP, computer vision, logical games, and knowledge graphs that can be interactively trained using constraints as supervision.
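To make the idea of turning a Python constraint function into a loss more concrete, here is a minimal sketch in plain Python. It is not PYLON's API: the names `constraint_loss` and `exactly_one` are hypothetical, the outputs are assumed to be independent binary variables, and the loss is computed by brute-force enumeration, which is only feasible for a handful of outputs. In a framework with automatic differentiation, the same expression would be differentiable with respect to the probabilities.

```python
import itertools
import math


def constraint_loss(probs, constraint):
    """Negative log-probability that `constraint` holds.

    `probs[i]` is the model's probability that binary output i equals 1.
    Outputs are treated as independent (an assumption of this sketch).
    """
    satisfied = 0.0
    # Enumerate every 0/1 assignment and add up the probability mass
    # of the assignments that satisfy the constraint.
    for assignment in itertools.product([0, 1], repeat=len(probs)):
        if constraint(*assignment):
            weight = 1.0
            for p, y in zip(probs, assignment):
                weight *= p if y == 1 else 1.0 - p
            satisfied += weight
    if satisfied == 0.0:
        return math.inf  # the constraint is impossible under these probabilities
    return -math.log(satisfied)


def exactly_one(a, b, c):
    # Hypothetical constraint: exactly one of three labels is active.
    return a + b + c == 1


# Predictions that nearly satisfy the constraint incur a lower loss.
low = constraint_loss([0.9, 0.05, 0.05], exactly_one)
high = constraint_loss([0.9, 0.9, 0.05], exactly_one)
```

The loss shrinks as the model places more probability mass on assignments that satisfy the constraint, which is the behavior a constraint-derived training signal needs.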

#3 A Goal-Driven Natural Language Interface for Creating Application Integration Workflows

Authors: Michelle Brachman ; Christopher Bygrave ; Tathagata Chakraborti ; Arunima Chaudhary ; Zhining Ding ; Casey Dugan ; David Gros ; Thomas Gschwind ; James Johnson ; Jim Laredo ; Christoph Miksovic ; Qian Pan ; Priyanshu Rai ; Ramkumar Ramalingam ; Paolo Scotton ; Nagarjuna Surabathina ; Kartik Talamadupula

Web applications and services are increasingly important in a distributed internet filled with diverse cloud services and applications, each of which enables the completion of narrowly defined tasks. Given the explosion in the scale and diversity of such services, their composition and integration for achieving complex user goals remains a challenging task for end-users and requires a lot of development effort when specified by hand. We present a demonstration of the Goal Oriented Flow Assistant (GOFA) system, which provides a natural language solution to generate workflows for application integration. Our tool is built on a three-step pipeline: it first uses Abstract Meaning Representation (AMR) to parse utterances; it then uses a knowledge graph to validate candidates; and finally it uses an AI planner to compose the candidate flow. We provide a video demonstration of the deployed system as part of our submission.

#4 UCSM-DNN: User and Card Style Modeling with Deep Neural Networks for Personalized Game AI

Authors: Daegeun Choe ; Youngbak Jo ; Shindong Kang ; Shounan An ; Insoo Oh

This paper addresses the long waiting times for finding an opponent in the player-versus-player mode of online sports games such as baseball, soccer, and basketball. In player-versus-player mode, a game-playing AI that stands in for a human player needs to be not only as smart as a human but also varied in its play, in order to improve the user experience against AI. Therefore, a need arises to design game-playing AI agents with diverse personalized styles. To this end, we propose a personalized game AI, named UCSM-DNN, which encodes user style vectors and card style vectors with a general DNN. Extensive experiments show that UCSM-DNN improves performance in terms of personalized styles, which enriches the user experience. UCSM-DNN has already been integrated into a popular mobile baseball game, MaguMagu 2021, as a personalized game AI.

#5 AI Assisted Data Labeling with Interactive Auto Label

Authors: Michael Desmond ; Michelle Brachman ; Evelyn Duesterwald ; Casey Dugan ; Narendra Nath Joshi ; Qian Pan ; Carolina Spina

We demonstrate an AI assisted data labeling system which applies unsupervised and semi-supervised machine learning to facilitate accurate and efficient labeling of large data sets. Our system (1) applies representative data sampling and active learning in order to seed and maintain a semi-supervised learner that assists the human labeler, (2) provides visual labeling assistance and optimizes labeling mechanics using predicted labels, (3) seamlessly updates and learns from ongoing human labeling activity, (4) captures and presents metrics that indicate the quality of labeling assistance, and (5) provides an interactive auto labeling interface to group, review and apply predicted labels in a scalable manner.

#6 CrowdFL: A Marketplace for Crowdsourced Federated Learning

Authors: Daifei Feng ; Cicilia Helena ; Wei Yang Bryan Lim ; Jer Shyuan Ng ; Hongchao Jiang ; Zehui Xiong ; Jiawen Kang ; Han Yu ; Dusit Niyato ; Chunyan Miao

Amid data privacy concerns, Federated Learning (FL) has emerged as a promising machine learning paradigm that enables privacy-preserving collaborative model training. However, there exists a need for a platform that matches data owners (supply) with model requesters (demand). In this paper, we present CrowdFL, a platform to facilitate the crowdsourcing of FL model training. It coordinates client selection, model training, and reputation management, which are essential steps for the FL crowdsourcing operations. By implementing model training on actual mobile devices, we demonstrate that the platform improves model performance and training efficiency. To the best of our knowledge, it is the first platform to support crowdsourcing-based FL on edge devices.

#7 CCA: An ML Pipeline for Cloud Anomaly Troubleshooting

Authors: Lili Georgieva ; Ioana Giurgiu ; Serge Monney ; Haris Pozidis ; Viviane Potocnik ; Mitch Gusat

Cloud Causality Analyzer (CCA) is an ML-based analytical pipeline to automate the tedious process of Root Cause Analysis (RCA) of Cloud IT events. The 3-stage pipeline is composed of 9 functional modules, including dimensionality reduction (feature engineering, selection and compression), embedded anomaly detection, and an ensemble of 3 custom explainability and causality models for Cloud Key Performance Indicators (KPI). Our challenge is: How to apply a reduced (sub)set of judiciously selected KPIs to detect Cloud performance anomalies, and their respective root causal culprits, all without compromising accuracy?

#8 SenSE: A Toolkit for Semantic Change Exploration via Word Embedding Alignment

Authors: Maurício Gruppi ; Sibel Adalı ; Pin-Yu Chen

Lexical Semantic Change (LSC) detection, also known as Semantic Shift, is the process of identifying and characterizing variations in language usage across different scenarios such as time and domain. It allows us to track the evolution of word senses, as well as to understand the difference between the language used in distinct communities. LSC detection is often done by applying a distance measure over vectors of two aligned word embedding matrices. In this demonstration, we present SenSE, an interactive semantic shift exploration toolkit that provides visualization and explanation of lexical semantic change for an input pair of text sources. Our system focuses on showing how the different alignment strategies may affect the output of an LSC model as well as on explaining semantic change based on the neighbors of a chosen target word, while also extracting examples of sentences where these semantic deviations appear. The system runs as a web application (available at http://sense.mgruppi.me), allowing the audience to interact by configuring the alignment strategies while visualizing the results in a web browser.
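The distance-over-aligned-embeddings step described above can be sketched as follows. This is an illustrative example and not SenSE's code: it assumes the two embedding spaces have already been aligned, uses cosine distance as the measure (one common choice), and the function names and toy 2-D vectors are made up for the example.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_shift(emb_a, emb_b):
    """Rank the shared vocabulary by distance between the aligned vectors.

    `emb_a` and `emb_b` map words to vectors and are assumed to live in
    the same (already aligned) space.
    """
    shared = set(emb_a) & set(emb_b)
    scores = {w: cosine_distance(emb_a[w], emb_b[w]) for w in shared}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up vectors: "cell" points in a different direction in the
# second source, while "river" keeps its direction.
source_a = {"cell": [1.0, 0.0], "river": [0.0, 1.0]}
source_b = {"cell": [0.0, 1.0], "river": [0.0, 2.0]}
ranking = rank_by_shift(source_a, source_b)
```

Words at the top of the ranking are the candidates for semantic shift. Because the scores depend entirely on how the two spaces were aligned, a different alignment strategy can reorder the list.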

#9 Dynamic Incentive Mechanism Design for COVID-19 Social Distancing

Authors: Xuan Rong Zane Ho ; Wei Yang Bryan Lim ; Hongchao Jiang ; Jer Shyuan Ng ; Han Yu ; Zehui Xiong ; Dusit Niyato ; Chunyan Miao

As countries enter the endemic phase of COVID-19, people's risk of exposure to the virus is greater than ever. There is a need to make more informed decisions in our daily lives on avoiding crowded places. Crowd monitoring systems typically require costly infrastructure. We propose a crowd-sourced crowd monitoring platform which leverages user inputs to generate crowd counts and forecast location crowdedness. A key challenge for crowd-sourcing is a lack of incentive for users to contribute. We propose a Reinforcement Learning based dynamic incentive mechanism to optimally allocate rewards to encourage user participation.

#10 MONICA2: Mobile Neural Voice Command Assistants towards Smaller and Smarter

Authors: Yoonseok Hong ; Shounan An ; Sunwoo Im ; Jaegeon Jo ; Insoo Oh

In this paper, we propose on-device voice command assistants for mobile games to improve the user experience even in hands-busy situations such as driving and cooking. Since most current mobile games consume a large amount of memory (e.g. more than 1GB), it is necessary to reduce memory usage further in order to integrate voice command systems on mobile clients. Therefore, a need arises to design an on-device automatic speech recognition system that costs minimal memory and CPU resources. To this end, we apply cross-layer parameter sharing to Conformer; the resulting model, named MONICA2, has lower memory usage for on-device speech recognition. MONICA2 reduces the number of parameters of the deep neural network by 58%, with minimal recognition accuracy degradation measured in word error rate on the Librispeech benchmark. As an on-device voice command user interface, MONICA2 costs only 12.8MB of mobile memory, and the average inference time for a 3-second voice command is about 30ms, as profiled on a Samsung Galaxy S9. As far as we know, MONICA2 is the most memory-efficient yet accurate on-device speech recognition system, and it could be applied to various applications such as mobile games, IoT devices, etc.
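The effect of cross-layer parameter sharing on model size can be illustrated with a small counting sketch. The grouping scheme, the function name `count_parameters`, and the layer sizes below are assumptions chosen for illustration; they are not the MONICA2 configuration and are not meant to reproduce the reported 58% figure.

```python
def count_parameters(num_layers, params_per_layer, share_group_size=1):
    """Unique parameters in a stack of identically shaped layers.

    Consecutive layers are grouped in runs of `share_group_size`, and all
    layers in a group reuse one set of weights. A group size of 1 means
    no sharing.
    """
    if share_group_size < 1:
        raise ValueError("share_group_size must be at least 1")
    num_groups = -(-num_layers // share_group_size)  # ceiling division
    return num_groups * params_per_layer


# Hypothetical stack: 16 layers with 1,000,000 parameters each.
baseline = count_parameters(16, 1_000_000)
shared = count_parameters(16, 1_000_000, share_group_size=2)
reduction = 1.0 - shared / baseline
```

Sharing reduces the number of stored weights but not the number of layer applications at inference time, so the memory saving does not by itself imply a matching reduction in compute.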

#11 A Trend-Driven Fashion Design System for Rapid Response Marketing in E-commerce

Authors: Lianghua Huang ; Yu Liu ; Bin Wang ; Pan Pan ; Rong Jin

Fashion is a form in which we express ourselves to the world, and it has grown into one of the largest industries in the world. Despite the significant evolution of the fashion industry over the past decades, it is still a great challenge to respond promptly and accurately to the diverse preferences of a large number of different consumers. To deal with the problem, we present an innovative demonstration of a trend-driven fashion design system using deep generative modeling, which enables automatic fashion design and editing based on trend reports. Our system consists of three components: trend-driven fashion design, interactive fashion editing, and popularity estimation. The system offers a unified framework for the mass production of fashion designs that conform to the trend, which helps businesses better respond to market demands.

#12 InteractEva: A Simulation-Based Evaluation Framework for Interactive AI Systems

Authors: Yannis Katsis ; Maeda F. Hanafi ; Martín Santillán Cooper ; Yunyao Li

Evaluating interactive AI (IAI) systems is a challenging task, as their output highly depends on the performed user actions. As a result, developers often depend on limited and mostly qualitative data derived from user testing to improve their systems. In this paper, we present InteractEva, a systematic evaluation framework for IAI systems. InteractEva employs (a) a user simulation backend to test the system against different use cases and user interactions at scale with (b) an interactive frontend allowing developers to perform important quantitative evaluation tasks, including acquiring a performance overview, performing error analysis, and conducting what-if studies. The framework has supported the evaluation and improvement of an industrial IAI text extraction system, results of which will be presented during our demonstration.

#13 ALLURE: A Multi-Modal Guided Environment for Helping Children Learn to Solve a Rubik’s Cube with Automatic Solving and Interactive Explanations

Authors: Kausik Lakkaraju ; Thahimum Hassan ; Vedant Khandelwal ; Prathamjeet Singh ; Cassidy Bradley ; Ronak Shah ; Forest Agostinelli ; Biplav Srivastava ; Dezhi Wu

Modern artificial intelligence (AI) methods have been used to solve problems that many humans struggle to solve. This opens up new opportunities for knowledge discovery and education. We demonstrate ALLURE, an educational AI system for learning to solve the Rubik’s cube that is designed to help students improve their problem solving skills. ALLURE can both find and explain its own strategies for solving the Rubik’s cube as well as build on user-provided strategies. Collaboration between AI and user happens using visual and natural language modalities.

#14 MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers

Authors: Yihuai Lan ; Lei Wang ; Qiyuan Zhang ; Yunshi Lan ; Bing Tian Dai ; Yan Wang ; Dongxiang Zhang ; Ee-Peng Lim

While Math Word Problem (MWP) solving has emerged as a popular field of study and made great progress in recent years, most existing methods are benchmarked solely on one or two datasets and implemented with different configurations. In this paper, we introduce the first open-source library for solving MWPs, called MWPToolkit, which provides a unified, comprehensive, and extensible framework for research purposes. Specifically, we deploy 17 deep learning-based MWP solvers and 6 MWP datasets in our toolkit. These MWP solvers are advanced models for MWP solving, covering the categories of Seq2seq, Seq2Tree, Graph2Tree, and Pre-trained Language Models. The MWP datasets are popular datasets that are commonly used as benchmarks in existing work. Our toolkit features highly modularized and reusable components, which can help researchers quickly get started and develop their own models. We have released the code and documentation of MWPToolkit at https://github.com/LYH-YF/MWPToolkit.

#15 SenTag: A Web-Based Tool for Semantic Annotation of Textual Documents

Authors: Andrea Loreggia ; Simone Mosco ; Alberto Zerbinati

In this work, we present SenTag, a lightweight web-based tool focused on semantic annotation of textual documents. The platform allows multiple users to work on a corpus of documents. The tool enables users to tag a corpus of documents through an intuitive and easy-to-use user interface that adopts the Extensible Markup Language (XML) as output format. The main goal of the application is two-fold: facilitating the tagging process and reducing or avoiding errors in the output documents. It also allows users to identify arguments and other entities that are used to build an argument graph. It is also possible to assess the level of agreement of annotators working on a corpus of text.

#16 Silence or Outbreak – a Real-Time Emergent Topic Identification System (RealTIS) for Social Media

Authors: Ning Lu ; Zhen Yang ; Jian Huang ; Yaxi Wu ; Hesong Wang

This paper presents RealTIS, a Real-time emergent Topic Identification System for user-generated content on the web via social networking services such as Twitter, Weibo, and Facebook. Without user intervention, our proposed RealTIS system can efficiently collect necessary social media posts, construct a quality topic summarization from the vast sea of data, and then automatically identify whether the emerging topics will break out or just fade into silence. RealTIS uses a time-sliding window to compute statistics about the variation of the basic structures (motifs) of the propagation network for a specific topic. These statistics are then used to predict unusual shifts in correlations, issue early warnings, and detect outbreaks. In addition, this work illustrates the mechanism by which our proposed system produces early warnings.

#17 SWWS: A Smart Wildlife Warning Sign System

Author: Alan Ma

Every year in the US, millions of animals are run over by vehicles, making wildlife-vehicle collisions a real danger to both animals and humans. In addition, road networks become abiotic barriers to wildlife migration between regions, creating ripple effects on ecosystems. In this paper, a smart wildlife warning sign system (SWWS) is demonstrated, utilizing the technologies of Internet of Things, image recognition, data processing and visualization. This smart sign system is intended to prevent roadkill by warning drivers to slow down once sensors are triggered, and to simultaneously capture animal images via infrared camera. Data collection is conducted through local neural network model identification of wildlife images, and the data are saved along with metadata based on animal activity occurrence. Wildlife activity data can be exported wirelessly to a cloud database to assist ecologists and government road agencies in investigating and analyzing wildlife activity and migration patterns over time.

#18 An End-to-End Traditional Chinese Medicine Constitution Assessment System Based on Multimodal Clinical Feature Representation and Fusion

Authors: Huisheng Mao ; Baozheng Zhang ; Hua Xu ; Kai Gao

Traditional Chinese Medicine (TCM) constitution is a fundamental concept in TCM theory. It is determined by multimodal TCM clinical features which, in turn, are obtained from TCM clinical information in the image (face, tongue, etc.), audio (pulse and voice), and text (inquiry) modalities. The automatic assessment of TCM constitution faces two major challenges: (1) learning discriminative TCM clinical feature representations; (2) jointly processing the features using multimodal fusion techniques. The TCM Constitution Assessment System (TCM-CAS) is proposed to provide an end-to-end solution to this task, along with auxiliary functions to aid TCM researchers. To improve the results of TCM constitution prediction, the system combines multiple machine learning algorithms such as facial landmark detection, image segmentation, graph neural networks and multimodal fusion. Extensive experiments are conducted on a four-category multimodal TCM constitution dataset, and the proposed method achieves state-of-the-art accuracy. Provided with datasets containing annotations of diseases, the system can also perform automatic disease diagnosis from a TCM perspective.

#19 A Demonstration of Compositional, Hierarchical Interactive Task Learning

Authors: Aaron Mininger ; John E. Laird

We present a demonstration of the interactive task learning agent Rosie, where it learns the task of patrolling a simulated barracks environment through situated natural language instruction. In doing so, it builds a sizable task hierarchy composed of both innate and learned tasks, tasks formulated as achieving a goal or following a procedure, tasks with conditional branches and loops, and involving communicative and mental actions. Rosie is implemented in the Soar cognitive architecture, and represents tasks using a declarative task network which it compiles into procedural rules through chunking. This is key to allowing it to learn from a single training episode and generalize quickly.

#20 Smart Out-of-Home Advertising Using Artificial Intelligence and GIS Data

Authors: Nader Nader ; Rafael Alexandrou ; Iasonas Iasonas ; Andreas Pamboris ; Harris Papadopoulos ; Andreas Konstantinidis

This demonstration paper introduces the Smart Out-of-Home Advertising Platform (SOAP), which leverages Geographic Information Systems (GIS) data and state-of-the-art Artificial Intelligence (AI) approaches to provide: (i) a documented, data-informed pricing model for billboards, which can be used to justify billboard prices to advertisers; and (ii) a set of non-dominated solutions (each corresponding to a different allocation of billboards to a given campaign) that explores the trade-offs between multiple conflicting objectives (e.g., cost and coverage). To the best of our knowledge, SOAP is the first to tackle such challenges in the context of Multi Objective Optimization (MOO).
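The notion of a non-dominated set over conflicting objectives can be sketched with a small filter. This is an illustrative example, not SOAP's optimizer: it assumes each candidate allocation is summarized as a `(cost, coverage)` pair, with lower cost and higher coverage preferred, and the candidate values are made up.

```python
def dominates(a, b):
    """True if allocation `a` dominates allocation `b`.

    Each allocation is a (cost, coverage) pair. `a` dominates `b` when it
    is no worse on both objectives and strictly better on at least one.
    """
    cost_a, coverage_a = a
    cost_b, coverage_b = b
    no_worse = cost_a <= cost_b and coverage_a >= coverage_b
    strictly_better = cost_a < cost_b or coverage_a > coverage_b
    return no_worse and strictly_better


def non_dominated(allocations):
    """Keep only the allocations that no other allocation dominates."""
    return [
        a for a in allocations
        if not any(dominates(b, a) for b in allocations)
    ]


# Made-up candidates: (campaign cost, fraction of audience covered).
candidates = [(100, 0.50), (150, 0.70), (150, 0.60), (200, 0.65), (250, 0.90)]
front = non_dominated(candidates)
```

Here `(150, 0.60)` and `(200, 0.65)` are dropped because `(150, 0.70)` is at least as cheap and covers more. The remaining allocations each represent a different cost-coverage trade-off that a campaign planner could choose from. The pairwise check is quadratic in the number of candidates, which is fine for a sketch but not for large search spaces.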

#21 AnomalyKiTS: Anomaly Detection Toolkit for Time Series

Authors: Dhaval Patel ; Giridhar Ganapavarapu ; Srideepika Jayaraman ; Shuxin Lin ; Anuradha Bhamidipaty ; Jayant Kalagnanam

This demo paper presents the design and implementation of AnomalyKiTS, a system for detecting anomalies in time series data that offers a broad range of algorithms to the end user, with a special focus on unsupervised and semi-supervised learning. Given an input time series, AnomalyKiTS provides four categories of model building capabilities followed by an enrichment module that helps to label anomalies. AnomalyKiTS also supports a wide range of execution engines to meet the diverse needs of anomaly workloads, such as Serverless for CPU-intensive work, GPU for deep-learning model training, etc.

#22 EasySM: A Data-Driven Intelligent Decision Support System for Server Merge

Authors: Manhu Qu ; Jie Huang ; Hao Deng ; Runze Wu ; Xudong Shen ; Jianrong Tao ; Tangjie Lv

As independent social and economic entities, game servers play a dominant role in building a stable, living, and attractive virtual world in massive multi-player online role-playing games (MMORPGs). We propose and implement a novel intelligent decision support system for server merge (SM) for maintaining the game ecology at the macro level. The services provided by this system include server health diagnosis, server merge assessment, and combination strategy recommendation. Specifically, we design an effective time series prediction algorithm to diagnose the health status of one server (e.g., user activity, online time, daily revenue) based on real game scenarios, and then select the servers with poor status from all servers. Moreover, to uncover the inherent development laws of servers from the historical merge records, we leverage a correlation measurement algorithm to find the historically merged servers that are similar to the servers to be merged and then evaluate the potential trend after merging, which can assist experts in making reasonable decisions. We deploy our system into practice for multiple MMORPGs and achieve sound online performance endorsed by the game operation team.

#23 FORCE: A Framework of Rule-Based Conversational Recommender System

Authors: Jun Quan ; Ze Wei ; Qiang Gan ; Jingqi Yao ; Jingyi Lu ; Yuchen Dong ; Yiming Liu ; Yi Zeng ; Chao Zhang ; Yongzhi Li ; Huang Hu ; Yingying He ; Yang Yang ; Daxin Jiang

Conversational recommender systems (CRSs) have received extensive attention in recent years. However, most of the existing works focus on various deep learning models, which are largely limited by the requirement of large-scale human-annotated datasets. Such methods are not able to deal with the cold-start scenarios in industrial products. To alleviate the problem, we propose FORCE, a Framework Of Rule-based Conversational rEcommender system that helps developers to quickly build CRS bots by simple configuration. We conduct experiments on two datasets in different languages and domains to verify its effectiveness and usability.

#24 A Synthetic Prediction Market for Estimating Confidence in Published Work

Authors: Sarah Rajtmajer ; Christopher Griffin ; Jian Wu ; Robert Fraleigh ; Laxmaan Balaji ; Anna Squicciarini ; Anthony Kwasnica ; David Pennock ; Michael McLaughlin ; Timothy Fritton ; Nishanth Nakshatri ; Arjun Menon ; Sai Ajay Modukuri ; Rajal Nivargi ; Xin Wei ; C. Lee Giles

Explainably estimating confidence in published scholarly work offers an opportunity for faster and more robust scientific progress. We develop a synthetic prediction market to assess the credibility of published claims in the social and behavioral sciences literature. We demonstrate our system and detail our findings using a collection of known replication projects. We suggest that this work lays the foundation for a research agenda that creatively uses AI for peer review.

#25 PantheonRL: A MARL Library for Dynamic Training Interactions

Authors: Bidipta Sarkar ; Aditi Talati ; Andy Shih ; Dorsa Sadigh

We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training. Our package is designed around flexible agent objects that can be easily configured to support different training interactions, and handles fully general multiagent environments with mixed rewards and n agents. Built on top of StableBaselines3, our package works directly with existing powerful deep RL algorithms. Finally, PantheonRL comes with an intuitive yet functional web user interface for configuring experiments and launching multiple asynchronous jobs. Our package can be found at https://github.com/Stanford-ILIAD/PantheonRL.